
    Report of the Third Workshop on the Usage of NetFlow/IPFIX in Network Management

    In 2010, the Network Management Research Group (NMRG) organized the Third Workshop on the Usage of NetFlow/IPFIX in Network Management as part of the 78th IETF Meeting in Maastricht. Held yearly since 2007, the workshop is an opportunity for people from both academia and industry to discuss the latest developments of the protocol, possibilities for new applications, and practical experiences. This report summarizes the presentations and the main conclusions of the workshop.

    Robust URL Classification With Generative Adversarial Networks

    Classifying URLs is essential for many applications, such as parental control, URL filtering, and ads/tracking protection. Such systems have historically identified URLs by means of regular expressions, although machine learning alternatives have been proposed to overcome the time-consuming maintenance of classification rules. Classical machine learning algorithms, however, require large samples of URLs to train the models, covering the diverse classes of URLs (i.e., a ground truth), which limits the applicability of the approach. Here we take a first step towards the use of Generative Adversarial Networks (GANs) to classify URLs. GANs are attractive for this problem for two reasons. First, GANs can produce samples of URLs belonging to specific classes even if exposed to a limited training set, outputting both synthetic traces and a robust discriminator. Second, a GAN can be trained to discriminate a class of URLs without being exposed to all other URL classes, i.e., GANs are robust even if not exposed to uninteresting URL classes during training. Experiments on real data show that not only are the generated synthetic traces realistic, but the URL classification with GANs is also accurate. © Copyright is held by the author/owner(s).
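
    As a rough illustration of the idea (a hypothetical sketch, not the paper's implementation), the fragment below trains a character-level GAN on URLs of a single class. The vocabulary, network sizes, and the toy `urls_of_class` list are assumptions made for the example; after training, the discriminator alone can score whether an unseen URL belongs to the class.

```python
# Minimal character-level GAN over fixed-length URL strings (hypothetical
# sketch). URLs are one-hot encoded; the discriminator learned here
# doubles as a one-class URL classifier.
import torch
import torch.nn as nn

VOCAB = "abcdefghijklmnopqrstuvwxyz0123456789./-:_"
MAX_LEN, NOISE_DIM = 40, 64

def encode(url):
    """One-hot encode a URL, padded/truncated to MAX_LEN characters."""
    x = torch.zeros(MAX_LEN, len(VOCAB))
    for i, ch in enumerate(url[:MAX_LEN]):
        if ch in VOCAB:
            x[i, VOCAB.index(ch)] = 1.0
    return x.flatten()

generator = nn.Sequential(
    nn.Linear(NOISE_DIM, 256), nn.ReLU(),
    nn.Linear(256, MAX_LEN * len(VOCAB)), nn.Sigmoid())
discriminator = nn.Sequential(
    nn.Linear(MAX_LEN * len(VOCAB), 256), nn.LeakyReLU(0.2),
    nn.Linear(256, 1), nn.Sigmoid())

opt_g = torch.optim.Adam(generator.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(discriminator.parameters(), lr=2e-4)
bce = nn.BCELoss()

urls_of_class = ["example.com/news/story1", "example.com/news/story2"]  # toy ground truth
real = torch.stack([encode(u) for u in urls_of_class])

for step in range(1000):
    # Discriminator step: real URLs labeled 1, generated ones labeled 0.
    fake = generator(torch.randn(len(real), NOISE_DIM)).detach()
    loss_d = bce(discriminator(real), torch.ones(len(real), 1)) + \
             bce(discriminator(fake), torch.zeros(len(real), 1))
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()

    # Generator step: try to fool the discriminator.
    fake = generator(torch.randn(len(real), NOISE_DIM))
    loss_g = bce(discriminator(fake), torch.ones(len(real), 1))
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()

# After training, discriminator(encode(url)) scores membership in the class.
```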

    Use and Analysis of Big Data Tools for the Study of Internet Connection Logs

    The goal of Big Data technology is the study of large amounts of data whose analysis would take too long in a traditional database. In this work, these technologies are used to study the traffic of an Internet network. To this end, the Big Data tools of the Apache Hadoop ecosystem are used: MapReduce, Spark, and Hive. These tools run on a computer cluster located at the Politecnico di Torino. In this network, traffic belonging to the Internet Protocol (IP) is monitored by a sniffer that creates the traffic logs on which the work is based. The first problem addressed is a study of the characteristics of the Apache Hadoop tools when used to analyze the stored network traffic logs. To this end, a series of tests is performed to compare their results across different types of analysis. After the study of these tools, an analysis of the stored IP traffic is carried out to characterize the protocols used in the network and the traffic generated. Since most of the recorded traffic belongs to the Hypertext Transfer Protocol (HTTP), the relationship in modern web services between the visited domain and the IP addresses used is studied.
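
    As a small illustration of this kind of analysis (not the thesis' actual code), the sketch below aggregates traffic volume per protocol with Spark, once as a DataFrame query and once in Hive/Spark SQL. The HDFS path and the column names (`proto`, `s_bytes`) are assumptions; a real sniffer such as Tstat defines its own log schema.

```python
# Hypothetical sketch: per-protocol traffic aggregation over stored logs.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("traffic-logs").getOrCreate()

flows = spark.read.csv("hdfs:///traffic/logs/*.csv",
                       header=True, inferSchema=True)

# Total volume and flow count per protocol, sorted by volume.
per_proto = (flows.groupBy("proto")
                  .agg(F.sum("s_bytes").alias("bytes"),
                       F.count("*").alias("flows"))
                  .orderBy(F.desc("bytes")))
per_proto.show()

# The same aggregation expressed as SQL, as one would run it in Hive.
flows.createOrReplaceTempView("flows")
spark.sql("""
    SELECT proto, SUM(s_bytes) AS bytes, COUNT(*) AS flows
    FROM flows GROUP BY proto ORDER BY bytes DESC
""").show()
```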

    Personal Cloud Storage Benchmarks and Comparison

    The large amount of space offered by personal cloud storage services (e.g., Dropbox and OneDrive), together with the possibility of synchronizing devices seamlessly, keeps attracting customers to the cloud. Despite the high public interest, little information about system design and its actual implications on performance is available when selecting a cloud storage service. Systematic benchmarks to assist in comparing services and understanding the effects of design choices are still lacking. This paper proposes a methodology to understand and benchmark personal cloud storage services. Our methodology unveils their architecture and capabilities. Moreover, by means of repeatable and customizable tests, it allows the measurement of performance metrics under different workloads. The effectiveness of the methodology is shown in a case study in which 11 services are compared under the same conditions. Our case study reveals interesting differences in design choices. Their implications are assessed in a series of benchmarks. Results show no clear winner, with all services having potential for improving performance. In some scenarios, the synchronization of the same files can take 20 times longer; in other cases, we observe twice as much network capacity being wasted, questioning the design of some services. Our methodology and results are thus useful both as benchmarks and as guidelines for system design.
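
    A toy version of one such repeatable test might look as follows (a sketch under stated assumptions, not the paper's benchmark suite): write a workload file into a folder synced by one client and time how long the same account takes to materialize it in a second client's folder. The folder paths, file sizes, and completion criterion are all assumptions for illustration.

```python
# Hypothetical sync-delay benchmark for a personal cloud storage service.
import os, time

UPLOAD_DIR = "/sync/client_a"     # watched by the first client
DOWNLOAD_DIR = "/sync/client_b"   # watched by the second client

def sync_time(name, size_bytes, compressible=False, timeout=600):
    """Write a workload file in UPLOAD_DIR; return seconds until it has
    fully appeared in DOWNLOAD_DIR, or None on timeout."""
    payload = (b"A" * size_bytes) if compressible else os.urandom(size_bytes)
    target = os.path.join(DOWNLOAD_DIR, name)
    start = time.monotonic()
    with open(os.path.join(UPLOAD_DIR, name), "wb") as f:
        f.write(payload)
    while time.monotonic() - start < timeout:
        if os.path.exists(target) and os.path.getsize(target) == size_bytes:
            return time.monotonic() - start
        time.sleep(0.1)
    return None

# Random bytes defeat compression/deduplication; repeated bytes (set
# compressible=True) reveal whether the client compresses before uploading.
for mb in (1, 10, 100):
    t = sync_time(f"random_{mb}MB.bin", mb * 2**20)
    print(f"{mb} MB random: " + (f"{t:.1f}s" if t is not None else "timeout"))
```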

    Report from the 6th PhD School on Traffic Monitoring and Analysis (TMA)

    This is a summary report by the organizers of the 6th TMA PhD school, held in Louvain-la-Neuve on 5-6 April 2016. The insights and feedback received about the event may prove useful for the organization of future editions and of similar events targeting students and young researchers.

    Impact of Access Line Capacity on Adaptive Video Streaming Quality - A Passive Perspective

    Adaptive streaming over HTTP is widely used to deliver live and on-demand video. It works by adjusting video quality according to network conditions. While QoE for different streaming services has been studied, it is still unclear how access line capacity impacts the QoE of broadband users in video sessions. We make a first step towards answering this question by characterizing parameters that influence QoE, such as the frequency of video adaptations. We take a passive point of view and analyze a dataset summarizing the video sessions of a large population over one year. We first split customers based on their estimated access line capacity. Then, we quantify how the latter affects QoE metrics by parsing HTTP requests of Microsoft Smooth Streaming (MSS) services. For selected services, we observe that at least 3 Mbps of downstream capacity is needed to let the player select the best bitrate, while at least 6 Mbps are required to minimize delays in retrieving initial fragments. Surprisingly, customers with faster access lines obtain limited benefits, hinting at restrictions in the design of the services.
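
    To make the parsing step concrete, the sketch below extracts the bitrate a player requested from MSS URLs, which follow the standard QualityLevels(<bitrate>)/Fragments(<track>=<time>) template; the sample URLs are made up, and counting bitrate switches is only a simple proxy for the video adaptations studied in the paper.

```python
# Sketch: recover player bitrate choices from Smooth Streaming requests.
import re

MSS_RE = re.compile(r"/QualityLevels\((\d+)\)/Fragments\((\w+)=(\d+)\)")

def parse_mss_request(url):
    """Return (bitrate_bps, track, fragment_start) or None for non-MSS URLs."""
    m = MSS_RE.search(url)
    if not m:
        return None
    bitrate, track, start = m.groups()
    return int(bitrate), track, int(start)

requests = [
    "http://cdn.example.com/show.ism/QualityLevels(1500000)/Fragments(video=0)",
    "http://cdn.example.com/show.ism/QualityLevels(3000000)/Fragments(video=20000000)",
]
bitrates = [parse_mss_request(u)[0] for u in requests]
# Bitrate changes across a session approximate the number of adaptations.
switches = sum(1 for a, b in zip(bitrates, bitrates[1:]) if a != b)
print(bitrates, "switches:", switches)
```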

    Measuring Web Speed From Passive Traces

    Understanding the Quality of Experience (QoE) of web browsing is key to optimizing services and keeping users’ loyalty. This is crucial for both Content Providers and Internet Service Providers (ISPs). Quality is subjective, and the complexity of today’s pages challenges its measurement. OnLoad time and SpeedIndex are notable attempts to quantify web performance with objective metrics. However, these metrics can only be computed by instrumenting the browser and, thus, are not available to ISPs. We designed PAIN: PAssive INdicator for ISPs. It is an automatic system to monitor the performance of web pages from passive measurements. It is open source and available for download. It leverages only flow-level and DNS measurements, which are still possible in the network despite the deployment of HTTPS. With unsupervised learning, PAIN automatically creates a machine learning model from the timeline of requests issued by browsers to render web pages, and uses it to measure web performance in real time. We compared PAIN to indicators based on in-browser instrumentation and found strong correlations between the approaches. PAIN correctly highlights worsening network conditions and provides visibility into web performance. We let PAIN run on a real ISP network, and found that it is able to pinpoint performance variations across time and groups of users.
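
    A toy rendering of the flow-timeline idea (not PAIN's actual code) follows: group the flows a client opens shortly after contacting a page's core domain, and use the span of their start times as a passive proxy for page-load time. The flow records, field names, and the 10-second window are assumptions for illustration.

```python
# Toy passive page-load proxy built from flow start times.
from dataclasses import dataclass

@dataclass
class Flow:
    client: str
    ts: float        # flow start time (s)
    domain: str      # server name from DNS/TLS, as visible to the ISP

WINDOW = 10.0  # seconds after the anchor flow counted as part of the load

def page_load_proxy(flows, client, core_domain):
    """Span between the first flow to the page's core domain and the last
    flow the client starts within WINDOW seconds."""
    starts = sorted(f.ts for f in flows if f.client == client)
    anchors = [f.ts for f in flows
               if f.client == client and f.domain == core_domain]
    if not anchors:
        return None
    t0 = min(anchors)
    in_window = [t for t in starts if t0 <= t <= t0 + WINDOW]
    return max(in_window) - t0

flows = [Flow("10.0.0.1", 0.0, "news.example"),
         Flow("10.0.0.1", 0.4, "cdn.example"),
         Flow("10.0.0.1", 1.7, "ads.example")]
print(page_load_proxy(flows, "10.0.0.1", "news.example"))  # -> 1.7
```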

    Lost in Translation: AI-based Generator of Cross-Language Sound-Squatting

    Sound-squatting is a phishing attack that tricks users into accessing malicious resources by exploiting similarities in the pronunciation of words. It is an understudied threat that gains traction with the popularity of smart speakers and the resurgence of content consumed exclusively via audio, such as podcasts. Defending against sound-squatting is complex, and existing solutions rely on manually curated lists of homophones, which limits the search to a few (and mostly existing) words only. We introduce Sound-squatter, a multi-language AI-based system that generates sound-squatting candidates for a proactive defence. It covers over 80% of exact homophones and further generates thousands of high-quality approximate homophones. Sound-squatter relies on a state-of-the-art Transformer network to learn transliteration. We search hundreds of millions of issued TLS certificates for cross-language sound-squatting domains generated by Sound-squatter, and compare them with other types of squatting candidates. Our findings reveal that around 6% of the generated sound-squatting candidates have issued TLS certificates, compared to 8% for other types of squatting candidates. We believe Sound-squatter uncovers the use of multilingual sound-squatting on the Internet and is a crucial asset for proactive protection against sound-squatting.
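
    The certificate-lookup step could be approximated as sketched below (an assumption-laden illustration, not the paper's pipeline): given candidate domains, here hard-coded rather than produced by a Transformer, ask the public crt.sh certificate-transparency search whether a TLS certificate was ever issued for each one.

```python
# Hypothetical check of squatting candidates against crt.sh's JSON endpoint.
import requests

def has_certificate(domain, timeout=15):
    """True if crt.sh reports at least one certificate for `domain`."""
    r = requests.get("https://crt.sh/",
                     params={"q": domain, "output": "json"},
                     timeout=timeout)
    r.raise_for_status()
    body = r.text.strip()
    return bool(body) and body != "[]"   # empty or [] means no certificates

candidates = ["exampel.com", "eggsample.com"]  # made-up homophone candidates
for d in candidates:
    try:
        print(d, "->", "certificate found" if has_certificate(d) else "none")
    except requests.RequestException as e:
        print(d, "-> lookup failed:", e)
```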

    Inside Dropbox: Understanding Personal Cloud Storage Services

    Personal cloud storage services are gaining popularity. With a rush of providers entering the market and an increasing offer of cheap storage space, it is to be expected that cloud storage will soon generate a high amount of Internet traffic. Very little is known about the architecture and the performance of such systems, and the workload they have to face. This understanding is essential for designing efficient cloud storage systems and predicting their impact on the network. This paper presents a characterization of Dropbox, the leading solution in personal cloud storage in our datasets. By means of passive measurements, we analyze data from four vantage points in Europe, collected during 42 consecutive days. Our contributions are threefold: Firstly, we are the first to study Dropbox, which we show to be the most widely used cloud storage system, already accounting for a volume equivalent to around one third of the YouTube traffic at campus networks on some days. Secondly, we characterize the workload typical users in different environments generate to the system, highlighting how this reflects on network traffic. Lastly, our results show possible performance bottlenecks caused by both the current system architecture and the storage protocol. This is exacerbated for users connected far from control and storage data-centers.
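
    A toy single-pass version of this kind of passive attribution is sketched below: learn Dropbox server IPs from DNS answers for *.dropbox.com observed in a packet trace, then tally the bytes exchanged with those IPs. The trace filename is made up, and a real study would rely on purpose-built passive probes rather than a script like this.

```python
# Hypothetical sketch: attribute trace bytes to Dropbox via DNS answers.
import socket
from collections import defaultdict

import dpkt

dropbox_ips, volume = set(), defaultdict(int)

with open("vantage_point.pcap", "rb") as f:
    for ts, buf in dpkt.pcap.Reader(f):
        eth = dpkt.ethernet.Ethernet(buf)
        if not isinstance(eth.data, dpkt.ip.IP):
            continue
        ip = eth.data
        # Step 1: harvest A records for *.dropbox.com from DNS responses.
        if isinstance(ip.data, dpkt.udp.UDP) and ip.data.sport == 53:
            try:
                dns = dpkt.dns.DNS(ip.data.data)
            except Exception:
                continue  # malformed or non-DNS payload
            for rr in dns.an:
                if rr.type == dpkt.dns.DNS_A and rr.name.endswith("dropbox.com"):
                    dropbox_ips.add(socket.inet_ntoa(rr.ip))
        # Step 2: attribute the packet's bytes if either endpoint matches.
        src, dst = socket.inet_ntoa(ip.src), socket.inet_ntoa(ip.dst)
        key = "dropbox" if {src, dst} & dropbox_ips else "other"
        volume[key] += len(buf)

print(dict(volume))
```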